Abstract

The Management System for Heterogeneous Networks (MSHN) is a
resource management system for use in heterogeneous environments.
This paper describes the goals of MSHN, its architecture, and both
completed and ongoing research experiments. MSHN’s main goal is to
determine the best way to support the execution of many different
applications, each with its own quality of service (QoS)
requirements, in a distributed, heterogeneous environment. MSHN’s
architecture consists of seven distributed, potentially replicated
components that communicate with one another using CORBA (Common
Object Request Broker Architecture). MSHN’s experimental
investigations include: (1) the accurate, transparent determination
of the end-to-end status of resources; (2) the identification of
optimization criteria and how non-determinism and the granularity of
models affect the performance of various scheduling heuristics that
optimize those criteria; (3) the determination of how security
should be incorporated between components as well as how to account
for security as a QoS attribute; and (4) the identification of
problems inherent in application and system characterization.

1.  Introduction

The Management System for Heterogeneous Networks (MSHN) project
seeks to determine an effective design for a resource management
system (RMS) that can deliver, whenever possible, the required
quality of service (QoS) to individual processes that are contending
for the same set of distributed, heterogeneous resources. Factors
influencing QoS requirements include security, user preferences for
different versions of an application, and deadlines. A set of QoS
requirements, considered together with resource availability,
determines whether all processes’ requirements can be met.

An RMS, also sometimes called a meta-computing system, is similar to
a distributed operating system in that it views the set of machines
that it manages as a single virtual machine [51]. Also, like any
distributed operating system, it attempts to give the user a
location-transparent view of the virtual machine. Hence, as in the
case of a distributed operating system, an RMS provides users with
improved performance while the location of resources is hidden. The
set of users of a system, which consists of both local and remote
resources, that is managed by an RMS should be able to attain a
higher level of availability and more fault tolerance than would be
available from their local system alone.

An RMS differs from a distributed operating system in that it does
not micro-manage the resources of each computer. Instead, each
computer runs its native operating system. Similarly, each router
executes its own protocol and each file server executes a native
distributed file system. The RMS is responsible for identifying the
large-grained resources, i.e., compute servers and data
repositories, that should be used by each process, if there is a
choice. It may be responsible for issuing a command to begin
execution of the processes that comprise an application. It may
monitor the status of both the resources in the system and the
progress of the applications for which it is responsible.

It is unclear whether every request to execute an application that
is submitted to any operating system on any of the machines in the
distributed system must be controlled by the RMS. If all requests
are controlled by the RMS, then allocation policies that attempt to
optimize throughput for a set of well-understood applications will
perform better. However, sometimes users wish to maintain control
over which resources their application will use.

There are many active, on-going research projects, in addition to
MSHN, in the area of resource management, and there are many major
research problems to be solved. A problem that MSHN is not
addressing is the best way for such a system to interact with human
users to obtain their QoS preferences and requirements in the most
user-friendly way. Indeed, simply identifying the syntax and
semantics required to express all of the QoS preferences and
requirements is a difficult problem [13] [17] [37] [55]. While MSHN
does not address this problem, the designers of MSHN expect to
leverage results from research in this area. They assume, for
example, that a request to execute an application is accompanied by
a list of deadlines, preferences for various versions of an
application, security requirements, and any restrictions on the
variance of the time at which a request should be completed.
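The assumed contents of such a request can be pictured as a small record. The following Python sketch is illustrative only: the class and field names, the numeric preference scale, and the example values are assumptions made here, not part of MSHN.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VersionPreference:
    """One acceptable version of an application and the user's preference for it."""
    version: str
    value: float  # hypothetical scale: larger means more preferred


@dataclass
class ExecutionRequest:
    """A request to execute an application, together with its QoS attributes."""
    application: str
    deadlines: List[float]  # seconds after submission (assumed unit)
    version_preferences: List[VersionPreference]
    security_requirements: List[str] = field(default_factory=list)
    max_completion_variance: Optional[float] = None  # None means unrestricted

    def preferred_versions(self) -> List[str]:
        """Return version names ordered from most to least preferred."""
        ranked = sorted(self.version_preferences,
                        key=lambda p: p.value, reverse=True)
        return [p.version for p in ranked]


# Hypothetical request: two versions, one deadline, one security requirement.
request = ExecutionRequest(
    application="image-filter",
    deadlines=[30.0],
    version_preferences=[VersionPreference("low-precision", 0.4),
                         VersionPreference("high-precision", 1.0)],
    security_requirements=["encrypted-transport"],
    max_completion_variance=2.0,
)
```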

Before leaving the general topic of RMSs, it is imperative that we
address the topic of “packaging.” MSHN researchers do not see the
fruits of the RMS research as a large, monolithic piece of software
that will require its own separate installation and maintenance. The
best way to package the eventual outcomes of the RMS projects may be
to incorporate them into an infrastructure- or middleware-level
standard similar to the Common Object Request Broker Architecture
(CORBA), Domain Name Services, or other such resource location
services. In this way, an RMS would not need to be separately
maintained and would be consolidated with the services that
distributed applications will most often use. However, it is still
worthwhile to separate research on RMSs from research in all other
aspects of distributed object computation that will be needed in
future versions of such standards in order to first isolate, then
solve some of the difficult resource management problems.

1.1.  Background 

MSHN evolved in part from a scheduling framework called SmartNet
[19] [29]. SmartNet’s goal was to be able to wisely schedule sets of
compute-intensive jobs, some of which may require the execution of
multiple processes, onto members of a suite of heterogeneous
computers. SmartNet provides a sophisticated scheduling module that
has been successfully integrated with many RMSs and distributed
computing environments. Hence, users who need to execute
compute-intensive jobs and have access to a shared, heterogeneous
environment can achieve superior performance, while continuing to
work in an environment to which they have grown accustomed [23].
Additionally, for those users who do not already have one installed,
SmartNet provides a basic RMS that makes use of its sophisticated
scheduling capabilities. SmartNet’s major research contributions
include:

•  The ability to predict the expected run-time of a job on a
machine using the concept of compute characteristics and
information collected from previous executions of the job.

•  The ability to leverage the heterogeneity inherent in both a
collection of jobs as well as in a collection of computers.
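The first contribution can be illustrated with a deliberately simplified sketch. It keeps past run times per (job, machine) pair and predicts the next run time as their mean. This is an assumption made for illustration: it ignores compute characteristics entirely and is not SmartNet's actual estimator. The class name, method names, and sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


class RunTimeHistory:
    """Keeps observed run times and predicts the next one as their mean."""

    def __init__(self):
        # (job, machine) -> list of observed run times in seconds
        self._samples = defaultdict(list)

    def record(self, job, machine, seconds):
        """Store the run time observed for one execution of a job."""
        self._samples[(job, machine)].append(seconds)

    def expected_run_time(self, job, machine):
        """Return the mean observed run time, or None if never executed."""
        samples = self._samples.get((job, machine))
        return mean(samples) if samples else None


history = RunTimeHistory()
history.record("fft", "machine-a", 10.0)
history.record("fft", "machine-a", 14.0)
history.record("fft", "machine-b", 30.0)
```

Because estimates are kept per machine, the same job can have very different expected run times on different machines, which is the heterogeneity the second contribution exploits.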

SmartNet was used successfully by DoD and the National Institutes of
Health in scheduling their compute-intensive jobs, and by NASA’s
EOSDIS system in determining whether their resources were adequate
to process data in the ways desired by their scientists.

SmartNet’s scheduling algorithms are tuned to attempt to minimize
the time at which the last job completes, although the designers of
SmartNet recognized that similar algorithms may be useful in
optimizing other criteria. Of course, minimizing the time at which
the last job of a set of jobs completes is, in general, an
NP-complete problem, so SmartNet employs heuristics when it searches
for a near-optimal mapping of jobs to machines and job execution
schedule. Many of the heuristics that it uses are well known and
previously documented; however, they had not previously been used in
a practical heterogeneous computing system [25]. It is likely that
they were not previously used in actual systems because system
designers had not tried to estimate average process run-times and
because it was not previously recognized that exact run-times,
though helpful, were not necessary [2] [3] [28].
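As an illustration of this kind of heuristic, the sketch below implements a simple greedy minimum-completion-time rule: each job, taken in order, is assigned to the machine on which it would finish earliest. The text does not say which heuristics SmartNet uses, so this rule, the function name, and the run-time table are assumptions chosen for illustration.

```python
def minimum_completion_time_mapping(jobs, machines, expected_run_time):
    """Greedily map each job to the machine where it would finish earliest.

    expected_run_time[job][machine] is an estimated run time in seconds.
    Returns the mapping and the time at which the last job completes.
    """
    ready_at = {m: 0.0 for m in machines}  # when each machine becomes free
    mapping = {}
    for job in jobs:
        best = min(machines,
                   key=lambda m: ready_at[m] + expected_run_time[job][m])
        ready_at[best] += expected_run_time[job][best]
        mapping[job] = best
    return mapping, max(ready_at.values())


# Hypothetical estimates for three jobs on two machines.
estimates = {
    "a": {"fast": 2.0, "slow": 4.0},
    "b": {"fast": 3.0, "slow": 3.0},
    "c": {"fast": 1.0, "slow": 5.0},
}
mapping, makespan = minimum_completion_time_mapping(
    ["a", "b", "c"], ["fast", "slow"], estimates)
```

The rule needs only estimated run times, not exact ones, which matches the observation above that exact run-times are helpful but not necessary.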

1.2.  Overview  of  MSHN’s  goals 

MSHN differs from SmartNet in three major ways. First, SmartNet was
expected, from the beginning, to be a system that would actually be
used in production. For this reason, much of the SmartNet
developers’ time was spent ensuring that SmartNet was at SEI Level
3. Despite this, SmartNet was able to make significant research
contributions. MSHN is intended to be a research system,
facilitating experiments by the investigators to determine how RMSs
that have somewhat broader goals than SmartNet can be built. MSHN’s
research goals expanded upon SmartNet’s in the following areas.

(i)  MSHN needs to consider that the overhead of jobs sharing
resources, such as networks and file servers, can have significant
impact on mapping and scheduling decisions.

(ii)  MSHN must support adaptive applications (defined below).

(iii)  MSHN must deliver good QoS to many different sets of
simultaneous users, some of whom may be executing interactive jobs;
others, compute-intensive jobs; and still others, jobs with
real-time requirements.

In SmartNet’s model, applications consist of three distinct phases.
In the first phase, which is short compared to the second phase,
they acquire data from a data repository. In the second phase, they
compute results based upon the data that they obtained during the
first phase. In the third phase, which is again very short compared
to the second phase, they write the result back to a possibly
different repository. Because the first and third phases are so
short, SmartNet’s heuristics assume that there is no contention for
either the network or the data repositories. However, they do
account for the time required to access the resources, assuming that
each application is the sole user of those resources. The model of
applications that MSHN is meant to manage is more complex,
permitting applications to transition through many more phases of
variable length, each requiring not only sharing of compute
resources, but also sharing of network and data repository
resources. We discuss briefly in this paper, and elaborate
elsewhere, both the problem of modeling the application and that of
accounting for lower level policies that govern the sharing of
resources. That is, because MSHN does not assume that it has any
control over network routing, file server memory allocation, etc.,
it models, when necessary, the lower level operating systems and
protocols. By doing so, the assignment of processes to resources
will account for the sharing of those resources in the correct way.

The second major difference between SmartNet and MSHN’s research
goals is that MSHN attempts to provide support for adaptive and
adaptation-aware applications. By adaptive applications, we mean
idempotent applications that can exist in several different
versions. Different versions may have different values to a user due
to factors such as precision of computation or input data.
Additionally, different versions may have different communication
and computation needs. Or, one version may execute on Windows NT
while another version is an executable for Linux. MSHN’s goal is to
support adaptive applications by being able to terminate one version
of an application if MSHN perceives that the currently executing
version will not meet the users’ QoS expectations.^ In that case,
MSHN would terminate the executing version and start up another
version from the beginning (if there were sufficient resources to
execute that other version). The requirement that adaptive
applications be idempotent permits the application to be safely
restarted from the beginning without corrupting any resource such as
a database. Similarly, there may be times when MSHN determines that
delivery of a better QoS is possible to a user by changing to a
version that better meets that user’s preferences.
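The restart decision for an adaptive application can be sketched as follows. This is a minimal illustration under stated assumptions: QoS is reduced to a single deadline, each version has a known expected run time from a cold start, and the most valued feasible version is chosen. The names, the selection rule, and the numbers are hypothetical, not MSHN's actual policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    name: str
    value: float              # user's preference for this version
    expected_run_time: float  # seconds when started from the beginning


def choose_restart_version(versions: List[Version], running: Version,
                           projected_finish: float, now: float,
                           deadline: float) -> Optional[Version]:
    """Pick a version to restart from the beginning, if one is needed.

    Returns None when the running version is still expected to meet the
    deadline, or when no other version could finish in time.
    """
    if projected_finish <= deadline:
        return None  # current version is on track; leave it running
    feasible = [v for v in versions
                if v.name != running.name
                and now + v.expected_run_time <= deadline]
    return max(feasible, key=lambda v: v.value) if feasible else None


# Hypothetical versions of one idempotent application.
precise = Version("precise", 1.0, 100.0)
fast = Version("fast", 0.6, 30.0)
draft = Version("draft", 0.3, 10.0)

late = choose_restart_version([precise, fast, draft], precise,
                              projected_finish=140.0, now=60.0, deadline=100.0)
on_track = choose_restart_version([precise, fast, draft], precise,
                                  projected_finish=95.0, now=60.0, deadline=100.0)
```

Restarting from the beginning is safe here only because adaptive applications are required to be idempotent.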

An adaptation-aware application differs from an adaptive application
in two ways. First, when it is terminated, the new version need not
be restarted from the beginning. Instead, a different version from
the one that terminated may be started, using information about a
previous state that was obtained from the execution of the previous
version. Second, an adaptation-aware application may be able to
adapt its resource usage during execution, without restarting.

Finally, MSHN’s goals differ from SmartNet’s in that MSHN seeks to
determine how to meet multiple different QoS requirements of
multiple different applications simultaneously. There are really two
issues bound up in this difference. First, a way to incorporate,
dynamically, the mixture of QoS requirements into a single measure
must be determined. Second, an assignment of applications to
resources must also be determined that optimizes the identified
measure. In resolving this second issue, we can strongly leverage
SmartNet’s emphasis on the separation of optimization criteria and
search algorithms and the recognition that similar algorithms can be
used to search many different types of spaces for optimal values. We
elaborate on this below.
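The separation of optimization criteria from search algorithms can be illustrated with a small sketch. Here the single measure is a weighted sum of individual QoS measures, and the search is a brute-force enumeration that accepts any objective function. Both choices are assumptions made for illustration; the text does not state how MSHN combines QoS requirements, and all names and data below are hypothetical.

```python
import itertools


def make_weighted_objective(measures, weights):
    """Combine several QoS measures into one score (assumed: weighted sum)."""
    def objective(assignment):
        return sum(weights[name] * measure(assignment)
                   for name, measure in measures.items())
    return objective


def exhaustive_search(applications, resources, objective):
    """Try every assignment; the search knows nothing about the criteria."""
    best_assignment, best_score = None, float("-inf")
    for choice in itertools.product(resources, repeat=len(applications)):
        assignment = dict(zip(applications, choice))
        score = objective(assignment)
        if score > best_score:
            best_assignment, best_score = assignment, score
    return best_assignment, best_score


# Hypothetical setting: "a" needs a secure host, "b" needs a fast one.
needs_security, needs_speed = {"a"}, {"b"}
measures = {
    "security": lambda asg: sum(asg[app] == "secure" for app in needs_security),
    "deadline": lambda asg: sum(asg[app] == "fast" for app in needs_speed),
}
objective = make_weighted_objective(measures,
                                    {"security": 1.0, "deadline": 1.0})
best, score = exhaustive_search(["a", "b"], ["fast", "secure"], objective)
```

Because the search takes the objective as a parameter, the exhaustive loop could be replaced by a heuristic without touching the measures, and the measures could be changed without touching the search.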

1.3.  Related  work 

There are other research groups examining the issues important to
building an RMS, many within DARPA’s Quorum project. Here, we look
at some of the projects related to MSHN. Some of these groups are
engaged in research complementary to MSHN’s goals. For the sake of
brevity, only a short synopsis of each project, as it relates to
MSHN, is presented.

^  We note that a version of one application may be terminated
because MSHN detects that another user’s application will not meet
its QoS expectations. This phenomenon can occur due to priorities.

DeSiDeRaTa. The University of Texas at Arlington has a project
called “DeSiDeRaTa: QoS Management Tools for Dynamic, Scalable,
Dependable, Real-Time Systems.” DeSiDeRaTa is focusing on QoS
specification, QoS metrics, dynamic QoS management, and benchmarking
of specific computing environments, such as the distributed
Anti-Air-Warfare system at the Naval Surface Warfare Center,
Dahlgren Division. A unique concept that has come out of the
DeSiDeRaTa project is that of an application “path” [56].

Globus. Globus is a large, joint project from Argonne National
Laboratory and the University of Southern California's Information
Sciences Institute. Parts of the Globus project are devoted toward
resource management issues. The Globus architecture depends on an
advance or immediate resource reservation protocol layer, for which
a standard does not yet exist [14] [18].

RT-ARM. Honeywell is developing a “Real-Time Adaptive Resource
Management” system aimed primarily at high-end, real-time military
embedded systems such as the Navy Surface Combatant Ship SC-21. Some
of the specific issues they are concentrating on include modeling
embedded systems and finding practical techniques for predictable
real-time performance [24].

EPIQ. The EPIQ project, from the University of Illinois at
Urbana-Champaign, is building an infrastructure for providing
guaranteed QoS features, upon which RMSs may be built. Part of their
infrastructure involves building their own runtime environment [35].

ERDoS.  SRI  International  is  running  a  project  called 
ERDoS  (End  to  End  Resource  Management  for 
Distributed  Systems)  which  is  developing  an  architecture 
for  adaptive  QoS-driven  resource  management.  The 
ERDoS  project  emphasizes  a  comprehensive  definition  of 
QoS  and  the  development  of  models  that  capture 
information  required  for  making  resource  management 
decisions  [46]. 

QUASAR. The QUASAR (QUAlity Specification and Adaptive Resource
management for distributed systems) project, at the Oregon Graduate
Institute of Science and Technology, is investigating techniques for
specifying and utilizing QoS in adaptive, distributed systems.
QUASAR is concentrating on the translation of QoS specifications
from the application level to the resource-management level, and its
use in reservation-based resource management, primarily in the
multimedia domain [53].

ASSERT. The ASSERT System at the University of Oregon, Eugene, is
focusing on dynamic, distributed, real-time environments. The core
of the project estimates and monitors the relevant QoS parameters of
running applications. ASSERT is not an RMS, nor an RMS framework;
rather, the ASSERT project is looking at a specific issue of RMSs:
QoS monitoring and estimation [16].

QuO. The Quality Objects (QuO) project, from BBN Systems
Technologies, is attempting to add QoS specification and delivery to
CORBA. Rather than provide absolute QoS guarantees, QuO seeks to
combine knowledge about resource and application conditions in order
to reserve enough end-to-end resources for predictable execution of
distributed applications [52].

MOL. The MOL (Metacomputing OnLine) project from the Paderborn
Center for Parallel Computing has as a goal the utilization of
multiple high performance systems for solving problems too large for
a single supercomputer. The MOL approach does not assume absolute
control of resources under its management. The MOL project is
addressing several of the issues key to resource management,
including QoS specification [42].

1.4.  Organization  of  the  paper 

In the next section of the paper we motivate and discuss MSHN’s
architecture. Even though SmartNet was successful in achieving its
functionality, rather than using SmartNet’s architecture exactly, we
based MSHN’s architecture upon lessons learned from SmartNet,
because MSHN’s goals are substantially different. In particular, we
clearly delineated certain of SmartNet’s modules into separate
components. This delineation makes it easier to experiment with
different designs for each of the components. In section 3, we then
discuss many of the research issues that the MSHN investigators are
studying and highlight some of the results. Additionally, this
section provides references to the numerous articles that describe
this research in more detail. We conclude by summarizing the status
of the MSHN project.

2.  MSHN’s  architecture 

In this section, we first describe some of the concepts that went
into MSHN’s architectural design. This description motivates the
need for the various major components and explains why they must be
replicated to varying degrees. The architectural design was driven
by the need to support the RMS research that we will discuss in the
next section and was aided by our previous experience with SmartNet.
We then present MSHN’s current architecture in detail.

2.1.  Motivation 

We  first  motivate  the  need  for  each  of  the  major 
components  of  MSHN’s  architecture,  then  discuss  how 
those  components  interact  with  one  another. 

We recall from the previous section that an RMS needs to
transparently locate the resources that should be used when
execution of an application is requested. Therefore, it must be made
aware of any request, by either a user or an application, to start
executing another application. Many early RMSs required the user to
explicitly log in to the system to start a job. If an application
was to be started from within another application, e.g., through
fork and exec system calls, then the application that makes the
request would be required to be specially designed to embed these
requests within a function call to an RMS library. This restriction
required that applications be specifically written or modified for a
particular RMS.

The MSHN designers do not want to force a user to explicitly log
into an RMS, or to modify their existing programs. Instead, MSHN
transparently intercepts calls to system libraries that would
otherwise initiate execution of a new process and diverts those
calls to a MSHN Client Library. After MSHN decides where the newly
requested application should execute, the MSHN Client Library uses
whatever mechanisms are available at the resource site to initiate
execution of the remote process.

The environments for which MSHN is designed contain many different
types of computers, each possibly executing a different version of
an operating system. Rather than requiring the Client Library, which
is linked with every MSHN application, to contain a substantial
amount of code that is specific to each of these computers, we chose
to make use of a MSHN Daemon. Whenever a computer is added to a
system, a MSHN Daemon is started on that computer. When a Client
Library needs to start a process on a remote machine, it simply
contacts the MSHN Daemon on that machine and requests that the
Daemon start the process on the Client Library’s behalf. Of course,
the general mechanism that we use in the Daemon is not new, and is
therefore not a research issue.

When a remote process needs to communicate with the initiating
process, it contacts the Client Library, which passes the
information on to the initiating process, just as though the remote
process were started locally. Being able to transparently provide
this service to applications, whether or not they are command
interpreters, requires that the Client Library intercept, and at
least pre-process if not divert, other system library calls in
addition to the previously mentioned exec call. For example, all of
the socket calls and all calls to open, close, read, and write files
must be intercepted and replaced or at least pre- and
post-processed.

The MSHN project required a mechanism for intercepting these calls
without requiring source modification. We initially turned to the
Condor project for help with this problem [36]. Condor is a project
at the University of Wisconsin that performs transparent migration
of processes in a Unix environment. To perform this migration,
Condor also had to intercept these calls to system libraries. Using
techniques similar to those used by Condor, we were able to
intercept these calls without requiring source code modification.^
The mechanism is described in detail elsewhere [44].


^  These techniques, however, require that the object code files be
linked with the MSHN Client Library; therefore, they require object
code files. However, another tool, the Executable Editing Library
(EEL), which evolved from the University of Wisconsin’s Paradyn
project, could be used to link an executable with the MSHN Client
Library instead [32].

In addition to providing a mechanism for transparently executing
remote processes, the Client Library is in a unique position to
passively determine the status of resources, because it is assumed
to be linked with any application executing in an environment
managed by MSHN. That is, the MSHN Client Library can pre- and
post-process system calls, because it is intercepting all such calls
made to the operating system, which are executed when a process
needs to use a hardware resource. In so doing, it can determine the
low level, end-to-end QoS that an application is receiving from a
particular resource. We will discuss this functionality of the
Client Library further in the next section.

When the MSHN Client Library intercepts a call to execute a new
process, it must have some way of determining which resources that
new process should use, i.e., which computer should primarily be
responsible for executing the new process.^ Rather than requiring
that decision to be made independently by each Client Library that
is linked with each application, we chose to have the Client Library
first check the request against a list of applications managed by
MSHN. If the requested application is not on that list, the MSHN
Client Library simply passes the requested application directly to
the local operating system. If the requested application is on that
list, it instead passes the request to the MSHN Scheduling Advisor.
It is the Scheduling Advisor’s job to determine which set of
resources the newly requested process should use.

^  In modern systems, the choice of computer that is responsible for
executing a process often carries with it, implicitly, a choice of
file servers and other distributed resources such as networks.
Therefore, when we say that MSHN chooses a computer to be
responsible for executing a process, the choice of other resources
external to that computer may be implicit in that assignment.

The MSHN Scheduling Advisor is itself a complex package, associated
with many different research issues which we discuss more fully in
the next section. Among the primary research issues are: (i) What
criteria should be optimized in the choice of resources? (ii)
Because optimizing the criteria is likely to be an NP-complete
problem, if n is too large, which heuristic should be used to search
for an optimum resource assignment? (iii) With what granularity must
the Scheduling Advisor model both the policies and protocols
associated with allocation of the lower level resources and what
granularity of model should it use to define the resource
requirements of a process?

For the Scheduling Advisor to determine a good assignment of
resources for a process, it must know both which resources and how
much of each resource would be required for a process to execute and
meet its QoS requirements and preferences. Therefore, to assist the
Scheduling Advisor in making its decision as to the assignment of
resources, we designed both the MSHN Resource Requirements Database
and the MSHN Resource Status Server.
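The path that an intercepted request takes through the Client Library and the Scheduling Advisor, as described in the preceding paragraphs, can be sketched as follows. This is a hypothetical model, not MSHN code: the class and method names are invented, resource requirements and status are reduced to a single CPU share per application and per machine, the advisor's "most free capacity" rule stands in for the real scheduling heuristics, and raising an error when nothing fits is an assumption.

```python
class SchedulingAdvisor:
    """Stand-in advisor: picks the machine with the most free capacity."""

    def __init__(self, requirements, status):
        self.requirements = requirements  # application -> required CPU share
        self.status = status              # machine -> available CPU share

    def advise(self, application):
        needed = self.requirements[application]
        candidates = [m for m, free in self.status.items() if free >= needed]
        if not candidates:
            return None
        return max(candidates, key=lambda m: self.status[m])


class ClientLibrary:
    """Decides what happens to an intercepted request to start a process."""

    def __init__(self, managed_applications, advisor, local_host):
        self.managed = set(managed_applications)
        self.advisor = advisor
        self.local_host = local_host

    def on_exec(self, application):
        # Unmanaged applications go straight to the local operating system.
        if application not in self.managed:
            return ("local-os", self.local_host)
        # Managed applications are placed by the Scheduling Advisor and
        # started through the MSHN Daemon on the chosen machine.
        target = self.advisor.advise(application)
        if target is None:
            raise RuntimeError("no machine satisfies the requirements")
        return ("mshn-daemon", target)


advisor = SchedulingAdvisor(requirements={"simulation": 0.5},
                            status={"host-a": 0.3, "host-b": 0.9})
library = ClientLibrary({"simulation"}, advisor, local_host="host-a")
```

In this sketch the `requirements` and `status` dictionaries play the roles of the Resource Requirements Database and the Resource Status Server, respectively.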

The Resource Status Server is a quickly changing repository that
maintains information concerning the current availability of
resources. Information is stored in the Resource Status Server as a
result of updates from both the MSHN Client Library and the MSHN
Scheduling Advisor. The Client Library can update the Resource
Status Server as to the currently perceived status of resources,
which takes into account resource loads due to processes other than
those managed by MSHN. The Scheduling Advisor can provide expected
future resource status based upon the resources that it expects will
be used by the applications that it assigns. Additionally, the
Resource Status Server can statistically process its historic
knowledge to make predictions of resource status even further in the
future.
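The three kinds of information described above (perceived status from the Client Library, expected load from the Scheduling Advisor, and statistics over history) can be pictured with the following sketch. It is an illustration under assumptions: load is a single number per resource, the historical statistic is a plain mean over a bounded window, and all names are hypothetical rather than taken from MSHN.

```python
from collections import defaultdict, deque
from statistics import mean


class ResourceStatusServer:
    """Keeps observed and expected load per resource."""

    def __init__(self, history_length=100):
        # Observations reported by Client Libraries, newest last.
        self._observed = defaultdict(lambda: deque(maxlen=history_length))
        # Load the Scheduling Advisor expects its assignments to add.
        self._expected = defaultdict(float)

    def report_observed_load(self, resource, load):
        """Called on behalf of a Client Library with a measured load."""
        self._observed[resource].append(load)

    def report_expected_load(self, resource, load):
        """Called on behalf of the Scheduling Advisor after an assignment."""
        self._expected[resource] += load

    def current_load(self, resource):
        """Most recent observation, or None if nothing was reported."""
        history = self._observed.get(resource)
        return history[-1] if history else None

    def expected_additional_load(self, resource):
        return self._expected.get(resource, 0.0)

    def historical_mean_load(self, resource):
        """Simple statistic over history (assumed: arithmetic mean)."""
        history = self._observed.get(resource)
        return mean(history) if history else None


server = ResourceStatusServer()
server.report_observed_load("network", 0.25)
server.report_observed_load("network", 0.75)
server.report_expected_load("network", 0.5)
server.report_expected_load("network", 0.25)
```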

As compared to the Resource Status Server, the information
maintained by the MSHN Resource Requirements Database changes much
more slowly. The Resource Requirements Database is responsible for
maintaining information about the resources that are required to
execute a particular application. Although the initial MSHN
prototype only implements a single source for the information stored
in this database (statistically analyzed historical information), we
envision that many other on-going research projects will also serve
as sources for this information.

MSHN’s current source for the information that is maintained by the
Resource Requirements Database is data collected by the MSHN Client
Library when the application was previously executed. Although
patterned after SmartNet in this way, and leveraging the concept of
compute characteristics that SmartNet pioneered, MSHN does not
collect the same information as SmartNet collects. SmartNet’s
information is coarse-grained; that is, it maintains only the total
amount of wall-clock time that is required to execute a program from
beginning to end for each particular machine. This measure is
sufficient for SmartNet’s needs due to the requirements of its
intended applications (three phases) and the expected environment
(each job has exclusive access to the resources that it is using).
However, in MSHN, resources are shared and applications have more
phases, so maintaining only this coarse-grained information is
insufficient. Therefore, the Resource Requirements Database has the
ability to maintain very fine-grained information collected by the
MSHN Client Library. Eventually it is hoped that the Resource
Requirements Database can also be populated with information from
smart compilers and possibly advice from application writers.

Applications, of course, are needed to test any system.
Unfortunately, executables for many different platforms would be
needed to test MSHN’s ability to manage them in a distributed,
heterogeneous environment. Producing such actual applications would
require tremendous effort to obtain the source code for numerous
applications, some of which may be classified or proprietary, port
the source code to the different platforms, and compile and link
them. We decided that this effort was better spent on our research
system itself, so we looked for another viable solution. One
solution that we considered was to use benchmarks, because many of
them have already been ported to many different platforms. However,
we wanted to make sure that our system could manage a wide variety
of applications. We finally settled on writing a general-purpose
application emulator whose parameters could be specified to cause it
to imitate a wide variety of applications. We discuss the problem of
deciding how best to construct such an emulator under the research
topics in the next section.

The Client Library, which is linked with each executing
MSHN application, informs the Resource Status Server
about the current perceived status of the resources that the
applications are using. The Scheduling Advisor informs
the Resource Status Server only about the load that it
expects the processes that it has scheduled to place on
certain resources. However, neither class of information
indicates the condition of resources that no MSHN
application is currently using or planning to use.
Therefore, we use a MSHN Application Emulator linked
with the Client Library to obtain information about the
condition of such resources.


Figure  1  MSHN’s  conceptual  architecture. 

MSHN’s conceptual architecture is shown in Figure 1.
As can be seen in the figure, every application running
with MSHN makes use of the MSHN Client Library that
intercepts the application’s operating system calls. When
the Client Library intercepts a request to execute a new
application, and that application requires that the MSHN
Scheduling Advisor be consulted to determine the
resources that the application should use, the Client
Library invokes a scheduling request on the Scheduling
Advisor. The Scheduling Advisor queries both the
Resource Requirements Database and the Resource Status
Server. It uses information that it receives from them,
along with an appropriate search heuristic, to determine
where the newly requested process should execute. After
determining which resources should host the new process,
the Scheduling Advisor returns the decision to the Client
Library, which, in turn, requests execution of that process
through the appropriate MSHN Daemon. The MSHN
Daemon invokes the application on its machine. As a
process executes, the Client Library updates both the
Resource Status Server and the Resource Requirements
Database with the current status of the resources and the
requirements of the process. Meanwhile, the Scheduling
Advisor establishes callbacks with both the Resource
Requirements Database and the Resource Status Server.
Using callbacks, the Scheduling Advisor is notified in the
event that either the status of the resources has
significantly changed, or the actual resource requirements
are substantially different from what was initially returned
from the Resource Requirements Database. In either case,
if it no longer appears that the assigned resources can
deliver the required QoS, the application must be adapted
or terminated. Upon receipt of a callback, the Scheduling
Advisor might require that several of the applications
adapt so that more of them can receive their requested or
desired QoS.
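
The request and callback flow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not MSHN's implementation: the class and method names, the use of a single scalar "capacity" per machine, the most-spare-capacity selection rule, and the threshold value are all hypothetical. The actual components communicate through CORBA and use search heuristics discussed in Section 3.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ResourceStatusServer:
    """Hypothetical stand-in for the RSS: available capacity per machine."""
    available: Dict[str, float]
    threshold: float = 0.0
    subscribers: List[Callable[[str, float], None]] = field(default_factory=list)

    def register_callback(self, threshold: float,
                          fn: Callable[[str, float], None]) -> None:
        # The Scheduling Advisor sets the threshold that defines a
        # "significant" change in resource status.
        self.threshold = threshold
        self.subscribers.append(fn)

    def update(self, machine: str, capacity: float) -> None:
        # Called on behalf of a Client Library with a fresh measurement.
        previous = self.available[machine]
        self.available[machine] = capacity
        if abs(capacity - previous) >= self.threshold:
            for fn in self.subscribers:
                fn(machine, capacity)


class SchedulingAdvisor:
    """Hypothetical stand-in for the SA, with a deliberately trivial heuristic."""

    def __init__(self, rss: ResourceStatusServer,
                 requirements: Dict[str, float],
                 threshold: float = 0.25) -> None:
        self.rss = rss
        self.requirements = requirements  # plays the role of the RRD
        self.notifications: List[Tuple[str, float]] = []
        rss.register_callback(threshold, self.on_status_change)

    def schedule(self, app: str) -> Optional[str]:
        # Query the "RRD" for the need and the "RSS" for the status, then
        # pick the feasible machine with the most spare capacity.
        need = self.requirements[app]
        feasible = {m: c for m, c in self.rss.available.items() if c >= need}
        return max(feasible, key=feasible.get) if feasible else None

    def on_status_change(self, machine: str, capacity: float) -> None:
        # A real advisor would decide here whether to adapt or terminate.
        self.notifications.append((machine, capacity))
```

In this sketch, a small status change is absorbed silently, while a change at or above the registered threshold triggers the callback, after which the advisor can re-evaluate its earlier assignment.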

Figure 2 Physical instantiation of the MSHN architecture.

Although all MSHN components could run on the same
machine, they can also be distributed and replicated across
many different computers using tools such as ISIS, Horus
and Ensemble [7] [50] [49]. Results from control theory
will also be useful here in ensuring that the process of
replicating and merging components is stable and does not
result in oscillation. Additionally, results from control
theory must be incorporated into the replicated Scheduling
Advisor itself to ensure that modifications requested of
adaptive and adaptation-aware applications do not become
unstable. MSHN components might even replicate as
needed [20] [21]. Figure 2 illustrates a simple
instantiation of the MSHN system.

In addition to the components discussed above, we
found it convenient to add a MSHN Visualizer that
enabled us to examine, for both functional and
performance debugging purposes, the current states of the
various MSHN components. The MSHN Visualizer
captures all significant events within and between the core
MSHN components for real-time and post-mortem
analysis.

Security within the MSHN architecture has been
considered. Policies of interest are:

•  Component authentication. This includes
authentication of MSHN core components to each
other; authentication of resource-based clients to
the MSHN core; and authentication of applications
to selected MSHN components.

•  Hierarchical least privilege. Within the MSHN
context, the core components are the most
privileged, while user applications are the least
privileged.

•  Communications integrity and confidentiality.
Communications are protected from unauthorized
modification and disclosure.

•  Access  control.  Access  to  MSHN  core  databases 
and  to  job  histories  may  be  mediated. 

The security architecture creates keyed domains,
supporting least privilege, authentication, confidentiality
and integrity by using the Common Data Security
Architecture facilities for security services and key
management^ [57] [58].

2.2.1. The current MSHN architecture. A high-level
description of the current MSHN architecture is presented.
For a more detailed description, we refer the reader to
other publications [43]. High-level diagrams are presented
for each MSHN component, with arrows indicating the
direction of communication or action. In addition to these
diagrams, a short description of each component’s
functions is given. In the description of the MSHN
architecture, we represent MSHN components and
external components as Unified Modeling Language
(UML) actors [8]. The symbols used for this
representation are shown in Figure 3. The core MSHN
components include the Scheduling Advisor (SA), the
Client Library (CL), the Resource Status Server (RSS), the
Resource Requirements Database (RRD), the Daemon (D)
and the Application Emulator (AE).


^As  in  any  RMS,  assurance  of  MSHN's  security  properties  is  built  on  and 
limited  by  the  effectiveness  of  the  security  environment  provided  by  the 
underlying  operating  system(s)  and  hardware  base(s). 

Figure 3 Symbols representing actors in the MSHN architecture.


Scheduling  Advisor  (SA)  functionality.  The  primary 
responsibility  of  the  SA  is  to  determine  the  best 
assignment  of  resources  to  a  set  of  applications,  based  on 
the  optimization  of  a  global  measure,  which  we  describe  in 
the  next  section.  The  SA  depends  on  the  RRD  and  the 
RSS  in  order  to  identify  an  operating  point  that  optimizes 
the  global  measure.  It  responds  to  resource  assignment 
requests  from  the  CL.  When  appropriate,  the  SA  requests 
application  adaptations  via  the  CL.  The  SA  is  also 
responsible  for  establishing  callback  criteria  (thresholds) 
with  the  RSS  and  RRD.  All  MSHN  components  update
the  MSHN  Visualizer  with  all  significant  display  and
post-mortem  analysis  events.


Client Library (CL) functionality. The CL is linked
with both adaptive and adaptation-aware applications. It
provides a transparent interface to all of the other MSHN
components. The CL intercepts system calls to collect
resource usage and status information, which is forwarded
to the RRD and the RSS. The CL also intercepts calls that
initiate new processes (such as exec()) and consults the
SA for the best place to start that process. It requests
(possibly remote) daemons to execute applications based
on the SA’s advice. The CL invokes adaptation on
adaptation-aware applications when notified by the SA via
callbacks. One such invocation is the special case of
setting emulator parameters.


Resource Status Server (RSS) functionality. The role
of the RSS is to maintain a repository of the three types of
information about the resources available to MSHN:
relatively static (long-term), moderately dynamic
(medium-term), and highly dynamic (short-term)
information. The RSS is updated with current data via the
CL or through a system administrator. The RSS responds
to SA requests with estimates of currently available
resources. The SA sets up callbacks with the RSS based
on resource availability thresholds and CL update
frequency requirements.


Resource Requirements Database (RRD)
functionality. The RRD is a repository of information
pertaining to the resource usage of applications. The RRD
provides this information to the SA. Callbacks to the SA
are based on either the occurrence of a threshold violation
or update frequency requirements. The RRD is updated by
the CL.


Daemon (D) functionality. The MSHN Daemon
executes on all compute resources available for use by the
SA. Its sole purpose is to start applications as requested
by the CL. It therefore also has the capability and
responsibility of initiating the default application emulator
at start-up to determine resource status information.


Application Emulator (AE) functionality. The AE
emulates a running application by stressing particular
resources in the same way as the real application does.
The AE serves two purposes. The first is to run simulated
applications (that statistically leave the same resource
usage footprint as the real applications) without the
overhead and uncertainty of actually installing,
maintaining, and running that particular application. The
second is to be a monitor, in the absence of any other
MSHN-scheduled applications. That is, it can determine
the status of resources that are not being otherwise used by
MSHN-scheduled applications, and therefore not being
monitored by an existing CL. The Daemon starts one
instance of the AE, by default, at startup. Other instances
may be started at any other time through a command
interpreter or other application.


3.  MSHN  Research  Issues 

In  this  section  of  the  paper,  we  describe  some  of  the 
major  issues  being  investigated  by  the  MSHN  team 
members.  We  also  briefly  summarize  some  of  the  results 
to  date.  Of  course,  there  is  not  sufficient  space  to 
completely  describe  all  of  the  issues  and  results  in  detail, 
so  the  reader  is  also  referred  to  relevant  papers  on  each 
topic.  We  have  attempted  to  associate  the  issues  with  the 
component  of  the  MSHN  architecture  that  they  most 
strongly  affect.  However,  certainly  many  issues  that  affect 
the  Scheduling  Advisor  also  affect  the  Resource  Status
Server  and  Resource  Requirements  Database. 
Additionally,  this  work  is  non-orthogonal  to  research 
being  done  by  many  investigators  outside  of  the  MSHN 
team  who  are  examining  such  issues  as  how  QoS 
requirements  are  derived  from  smart  compilers  and  how 
they  can  be  best  expressed. 

3.1.  Scheduling  Advisor  research  issues 

In this section we discuss some issues that most
strongly affect the Scheduling Advisor. First, we examine
how to quantify the needs of all of the processes that
require resource allocation by the Scheduling Advisor.
Then, we consider the ramifications of not precisely
knowing the resource requirements, and consequently, the
exact future status of all of the resources. Finally, we
discuss the class of heuristics that have thus far been
implemented in MSHN and why there is a need for a
variety of heuristics.

3.1.1. Optimization criteria. Optimal resource
allocation always involves attempting to solve an
optimization problem, which is usually NP-complete.
SmartNet’s primary optimization criterion was to
minimize the time at which an application completes,
assuming that all of the applications were of a particular
form. Later versions of SmartNet also accounted for
priorities. MSHN maximizes a weighted sum of values
that represents the benefits and costs of delivering the
required and desired QoS (including security, priorities,
and preferences for versions), within the specified
deadlines, if any. We now discuss the effect of each of
these attributes on the optimization criteria.

•  MSHN’s consideration of security as an
optimization criterion allows the trade-off of
security with other QoS constraints when there are
insufficient resources to complete all requests.
This is done in a fashion similar to other recent
projects [45]. MSHN associates a cost with security
levels that varies, depending upon which resources
are being used to obtain a given level of security
(for more details on security viewed as a QoS
parameter, see Section 3.2.2).

•  MSHN attempts to account for both preferences for
various versions and priorities. That is, when it is
impossible to deliver all of the most preferred
information within the specified deadlines due to
insufficient resources, MSHN’s optimization
criteria are designed to favor delivering the most
preferred version to the highest priority
applications.

•  In MSHN’s optimization criteria, deadlines can be
simple or complex. That is, sometimes a user will
be satisfied if a result is received before a specific
time. Other times, a user would like to associate a
more general benefit function, which would
indicate that information might have different
values based upon when it is received.

Further  information  about  MSHN’s  optimization
criteria  can  be  found  elsewhere  [22]  [30].
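
To make the shape of such a criterion concrete, the following Python sketch combines priorities, version preferences, deadline-based benefit functions, and a security cost into one weighted sum. It is only an illustration under stated assumptions: the multiplicative form, the field names, and the two benefit functions are hypothetical and are not taken from MSHN; the actual measure is defined in the cited publications.

```python
from typing import Callable, Dict, List


def step_benefit(deadline: float) -> Callable[[float], float]:
    """Simple deadline: full value if the result arrives in time, else none."""
    return lambda finish: 1.0 if finish <= deadline else 0.0


def linear_decay_benefit(soft: float, hard: float) -> Callable[[float], float]:
    """Complex deadline: full value until `soft`, decaying to zero at `hard`."""
    def benefit(finish: float) -> float:
        if finish <= soft:
            return 1.0
        if finish >= hard:
            return 0.0
        return (hard - finish) / (hard - soft)
    return benefit


def weighted_value(jobs: List[Dict]) -> float:
    """Sum of priority * preference * benefit(finish), minus security cost.

    Each job is a dict with the hypothetical keys 'priority', 'preference',
    'benefit' (a function of finish time), 'finish', and 'security_cost'.
    """
    return sum(
        job["priority"] * job["preference"] * job["benefit"](job["finish"])
        - job["security_cost"]
        for job in jobs
    )
```

A scheduler using a measure of this kind would compare candidate assignments by the value each one yields and prefer the larger, which naturally favors delivering preferred versions to high-priority applications when resources are scarce.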

In addition to a cost function that is optimized,
optimization problems usually have a set of constraints
that must be met in order for a solution to be viable. The
constraints of a resource allocation optimization problem
are that the resources allocated to meet the needs of the
processes must be less than or equal to the available
resources at any point in time. The actual inequalities
required not only depend upon the QoS constraints, but
also upon the sharing policies used by the local operating
systems and network protocols, and upon the granularity
with which both those policies and resource usage should
be known (see Granularity issues in Section 3.2.3).

3.1.2. Inexact knowledge of job resource usage. Even
if it is possible to find a perfect solution to the
optimization problem that is posed by instantiating the
constraints and optimization criteria to the current
situation, the expected resource usage of any given
application is often only an estimate. In real-time systems,
the worst case estimate is often used to assign resources to
processes; however, many other systems use the mean
expected resource usage. Our recent analysis has revealed
that using the mean will cause the actual run-time to be
generally underestimated and that a better assignment can
be made if both the mean and distribution of the expected
resource usage are accounted for, when appropriate [28].
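
One mechanism that produces this kind of underestimate can be shown with a small simulation. The sketch below is an illustration under stated assumptions and is not the analysis of [28]: it assumes one job per machine, uniformly distributed run-times around each mean, and a completion time equal to the slowest machine. Because the expected value of a maximum is at least the maximum of the expected values, the mean-based estimate can only be too low or exact.

```python
import random
from typing import List, Tuple


def completion_time(runtimes: List[float]) -> float:
    """The set of jobs finishes when the slowest machine finishes."""
    return max(runtimes)


def mean_vs_simulated(mean_runtimes: List[float], spread: float,
                      trials: int = 10000, seed: int = 0) -> Tuple[float, float]:
    """Return (estimate from means, average simulated completion time).

    Each run-time is drawn uniformly from mean*(1-spread) to mean*(1+spread);
    the uniform distribution is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    estimate = completion_time(mean_runtimes)
    total = 0.0
    for _ in range(trials):
        sample = [rng.uniform(m * (1 - spread), m * (1 + spread))
                  for m in mean_runtimes]
        total += completion_time(sample)
    return estimate, total / trials


if __name__ == "__main__":
    estimate, simulated = mean_vs_simulated([10.0, 10.0, 10.0], spread=0.5)
    print(f"estimate from means: {estimate:.2f}, simulated: {simulated:.2f}")
```

For three machines with identical means of 10 and a spread of 0.5, the mean-based estimate is 10, while the expected maximum of three independent uniform draws on [5, 15] is 12.5, so the simulated average is noticeably higher than the estimate.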

This leads to another question: when the actual variance
of run-times is large, does the extra complexity of a
sophisticated heuristic yield a better schedule than a simple
heuristic, whether scheduling is done using the mean or
both the mean and the distribution? Our recent results in
this area have shown that, in many cases, complex
heuristics determine schedules that, when executed,
perform much better than the schedules derived from very
simple heuristics, even when the variance is large. However,
sometimes very simple heuristics perform just as well as
the more complex ones. The difference in quality of the
schedules produced by the various heuristics was found to
be closely correlated with the type of heterogeneity in a
system. For example, when both the machine and
application heterogeneity are very low, a simple heuristic
performs just as well as more complex ones. Several
papers have described our results concerning this research
[2] [3] [10] [40].

3.1.3. Performance of search algorithms. SmartNet’s
organization leveraged the idea of independence of search
algorithms and optimization criteria. That is, most
heuristics for searching the space of mappings can be
modified to search for solutions to different optimizations
within the same space. For example, Dantzig’s Simplex
Method is useful with all problems whose optimization
criteria and constraint inequalities can be stated using only
linear combinations of the variables. Sometimes, many
different heuristics will work, but, depending upon the
characteristics of a given problem, certain heuristics may
be preferable to others. For example, the MSHN team has
obtained extensive results identifying the regions of
heterogeneity where certain heuristics perform better than
others for maximizing throughput by minimizing the time
at which the last application, of a set of applications,
should complete [2] [3] [10] [40]. Re-targeting of these
heuristics to other optimization criteria is currently
underway.
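
As an illustration of how a simpler and a more elaborate heuristic can differ on this criterion, the Python sketch below maps independent tasks onto machines with two heuristics that are well known in the heterogeneous computing literature: minimum completion time (MCT), which handles tasks in arrival order, and Min-min, which repeatedly picks the task that can finish earliest. The text above does not state which heuristics MSHN implements, so these two are stand-ins chosen for illustration, and the expected-time matrix is invented.

```python
from typing import List, Optional, Tuple

# etc[t][m] is the expected time to run task t on machine m.
Matrix = List[List[float]]


def mct(etc: Matrix) -> Tuple[List[int], float]:
    """Assign tasks in order, each to the machine that completes it earliest."""
    ready = [0.0] * len(etc[0])
    mapping = []
    for row in etc:
        m = min(range(len(row)), key=lambda j: ready[j] + row[j])
        ready[m] += row[m]
        mapping.append(m)
    return mapping, max(ready)


def min_min(etc: Matrix) -> Tuple[List[Optional[int]], float]:
    """Repeatedly map the unmapped task with the smallest best completion time."""
    ready = [0.0] * len(etc[0])
    mapping: List[Optional[int]] = [None] * len(etc)
    unmapped = set(range(len(etc)))
    while unmapped:
        best = None  # (completion time, task, machine)
        for t in sorted(unmapped):
            m = min(range(len(ready)), key=lambda j: ready[j] + etc[t][j])
            finish = ready[m] + etc[t][m]
            if best is None or finish < best[0]:
                best = (finish, t, m)
        finish, t, m = best
        ready[m] = finish
        mapping[t] = m
        unmapped.remove(t)
    return mapping, max(ready)


if __name__ == "__main__":
    etc = [[10, 12], [2, 9], [3, 4]]  # three tasks, two machines
    print("MCT:    ", mct(etc))
    print("Min-min:", min_min(etc))
```

On this small matrix, MCT finishes the last task at time 13 and Min-min at time 12. On other matrices the two heuristics tie or the ordering reverses, which is the kind of dependence on problem characteristics that the results cited above quantify.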

Additionally, MSHN team members have performed
extensive research into accounting for dependencies
between applications or processes that make up a single
application [40] [47] [48] [54]. This includes promising
results from investigating data dependencies and mapping
of iterative applications [1] [4] [5] [6] [11].

3.2.  Resource  Status  Server  and  Resource 
Requirements  Database  research  issues 

Part  of  the  MSHN  team’s  investigation  has  been  aimed 
at  determining  what  information  should  be  stored  in  the 
Resource  Requirements  Database  and  maintained  by  the 
Resource  Status  Server.  First,  a  taxonomy  for  the  types  of 
information  that  could  be  stored  there  was  required.  We 
discuss  this  taxonomy  below.  We  also  discuss  the  impact 
that  viewing  security  as  a  QoS  has  on  these  two  MSHN 
components.  Finally,  one  of  the  most  important  issues  in 
designing  effective  RMSs  is  determining  the  level  of 
granularity  of  information  that  must  be  maintained 
concerning  the  status  of  resources  and  the  requirements  of 
applications.  We  now  discuss  each  of  these  issues  in
somewhat  more  detail  and  refer  the  interested  reader  to
relevant  publications.

3.2.1. A taxonomy. The MSHN team has formulated a
three-part taxonomy for classifying systems. The three
parts describe the applications, the computing environment,
and the mapping strategy that is used. Some of the relevant
characteristics that need to be instantiated concerning each
application include

(i)  Its  size,  that  is  the  number  of  tasks  or  sub-tasks 
associated  with  it. 

(ii)  Whether  the  sub-tasks  are  independent  of  one 
another  or,  if  they  are  dependent,  the  types  of 
dependencies. 

(iii)  The I/O distributions of the application and the
sources of the I/O, i.e., whether it performs all
input in the beginning and all output at the end or
whether one or the other is performed
continually throughout the lifetime of the
processes and whether the input data is obtained
through interacting with a person or some other
source that has highly variable response times.

(iv)  The deadlines and other QoS requirements,
including security, if any, associated with the
applications and/or the subtasks that comprise
the application.

Similarly, the computing environments and mapping
strategies have numerous, hierarchically characterizable
attributes that are more fully documented in other
publications [9].


3.2.2. Security as a quality of service. Security in the
context of QoS is a current research area [34] [45]. The
security capabilities of resources and security
requirements of applications must influence the
assignment of applications to resources. We can obtain
information concerning the user security requirements
from the Resource Requirements Database and
information concerning the security capabilities of the
resource from the Resource Status Server. For example, if
the output of an application must be encrypted using a
particular algorithm, with a key size chosen within a
particular range, then that requirement must be stored in
the Resource Requirements Database along with the
amount of data that must be encrypted. Also, the
Resource Status Server must know whether each particular
computing resource is capable of performing the required
cryptographic algorithm and the cost, in terms of run-time
per byte, for example, of encrypting the data. Members of
the MSHN team have developed an initial framework,
which they are currently refining, for characterizing the
overall security attributes of a network and for
determining a cost and benefit value for providing
required and preferred security to an application
[26] [27] [33] [34].
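
The encryption example above can be sketched as a small matching step between a requirement record and a capability record. This Python sketch is illustrative only: the record layouts, the field names, and the algorithm label and cost figures used when exercising it are hypothetical, and it does not represent the framework of the cited publications.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SecurityRequirement:
    """What the Resource Requirements Database would hold for one application."""
    algorithm: str
    min_key_bits: int
    max_key_bits: int
    bytes_to_encrypt: int


@dataclass
class ResourceCapability:
    """What the Resource Status Server would hold for one computing resource."""
    name: str
    # algorithm -> {key size in bits: run-time cost in seconds per byte}
    seconds_per_byte: Dict[str, Dict[int, float]]


def encryption_cost(req: SecurityRequirement,
                    res: ResourceCapability) -> Optional[float]:
    """Cheapest run-time in seconds to satisfy `req` on `res`.

    Returns None when the resource cannot perform the required algorithm
    with any key size in the acceptable range.
    """
    by_key = res.seconds_per_byte.get(req.algorithm, {})
    costs = [
        spb * req.bytes_to_encrypt
        for bits, spb in by_key.items()
        if req.min_key_bits <= bits <= req.max_key_bits
    ]
    return min(costs) if costs else None
```

A resource for which the function returns None would be excluded from the candidate set, and the returned cost for the remaining resources could feed into the optimization criterion as part of the cost of delivering the requested security level.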

3.2.3. Granularity issues. Another very important
question that concerns both the Resource Requirements
Database and the Resource Status Server has to do with
how much detail should be maintained concerning the
status of resources and the requirements of applications.
Obviously, while a very accurate, detailed set of
information might prove quite useful to the scheduling
algorithms, it would be, at the least, very expensive and
difficult to collect, if not also expensive to process within
the algorithm itself.

The  MSHN  team  has  obtained  initial  estimates  for  the 
overhead  of  capturing  system  calls  to  determine  the  cost  of
collecting  various  granularities  of  such  information  [44]. 
Members  of  the  team  are  currently  using  this  technique  to 
record  fine-grained  information  for  a  program  that 
analyzes  air  tasking  orders  and  will  report  both  the 
information  concerning  the  resources  that  were  used,  as 
well  as  the  overhead  involved  in  collecting  the  resource 
usage  information  [41]. 

In addition to the cost associated with collecting
fine-grained information concerning applications’ use of
resources, there is the question of how much information
is sufficient. Current experiments of the MSHN team
focus on determining whether fairly simple models can be
used to predict the relative performance of
application/resource assignments. To perform realistic
experiments, the team has built an initial application
emulator (see below) and is actually executing it with
different parameters on different systems, using all
possible configurations to compare the actual received
QoS to the predicted QoS. Thus far we have determined
that the Resource Status Server must, directly or
indirectly, contain information concerning whether native
threads are supported by the operating system. If this
information is not maintained, the scheduling algorithm,
which must choose between two platforms that are
identical except for the operating system version that they
execute, may assign a process which could be handled
better by one platform to the other. Similarly, the
Resource Requirements Database must indicate whether or
not the application is multi-threaded and the number and
nature of threads that it uses. Information concerning
these results can be found in other publications [12].

3.3.  Application  Emulator  research  issues 

The MSHN team is designing and implementing an
application emulator for two different reasons. One reason
is that it is needed within the MSHN architecture to
monitor the end-to-end status of the resources. The other
reason is to be able to easily construct a very large suite of
application emulators that place loads on resources in the
same way that the actual applications would. When used
in conjunction with resource usage measurements from
linking actual applications to MSHN’s Client Library, the
MSHN Application Emulator can be used to emulate the
execution of the actual applications without requiring the
applications to actually be ported to many different
platforms. The obvious advantage of using such an
application emulator, rather than porting the applications
themselves, is to enable the MSHN researchers to test their
architecture more quickly under many different situations.

To meet the first purpose of the MSHN Application
Emulator, we first had to define what it means to load
each kind of resource. Percentages cannot be
used, as they are not transferable between either
computing platforms or network media. Rather, each
category of resource was identified and units that can be
most easily translated between different platforms, such as
FLOPS and bytes/sec, were chosen to quantify resource
use. Also recognized at this stage was the need to have
both multi-threaded and non-multi-threaded application
emulator capability. Finally, not only can a single
application be composed of multiple threads, but it can
also be composed of multiple heavy-weight processes.

When designing the Application Emulator to meet both
of its requirements, we recognized that distributions
reflecting communication and computation alone were
insufficient; conditional probabilities were required. That
is, many times the purpose of one process sending a
message to another process is so that the receiving process
will perform work on behalf of the sending process.
Therefore, we designed our most general emulator to also
have the capability of sending work-bearing messages.
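
The idea of a work-bearing message can be sketched as follows. This Python fragment is a hypothetical illustration, not the MSHN Application Emulator: the class names, the fixed per-step amounts, and the single probability parameter are assumptions, and the sketch only counts operations and bytes where a real emulator would actually consume CPU time and network bandwidth.

```python
import random
from dataclasses import dataclass


@dataclass
class Message:
    payload_bytes: int
    work_flops: float  # 0.0 for a plain data message


class EmulatedProcess:
    """Counts the load a real emulator process would place on resources."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.flops_done = 0.0
        self.bytes_received = 0

    def compute(self, flops: float) -> None:
        self.flops_done += flops  # a real emulator would burn CPU here

    def receive(self, msg: Message) -> None:
        self.bytes_received += msg.payload_bytes
        if msg.work_flops > 0:
            # Work-bearing message: the receiver computes on the sender's
            # behalf, so its computation is conditional on the message.
            self.compute(msg.work_flops)


def run_sender(sender: EmulatedProcess, receiver: EmulatedProcess,
               steps: int, p_work: float, rng: random.Random) -> None:
    """Each step, the sender computes, then sends one message that carries
    work with probability `p_work`."""
    for _ in range(steps):
        sender.compute(1e6)
        work = 5e6 if rng.random() < p_work else 0.0
        receiver.receive(Message(payload_bytes=1024, work_flops=work))
```

With `p_work` set to zero the receiver's load is purely communication, whereas with larger values its computation becomes tied to the message traffic, which independent distributions for communication and computation could not express.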


To this end, we have completed an initial
implementation of an application emulator that we have
used for our granularity research and are testing the more
general application emulator. Documentation concerning
both of these application emulators can be found
elsewhere [12] [15].

3.4.  Client  Library  research  issues 

The research issues having to do with the Client
Library component involve both mechanism and policy.
The mechanism issues have to do with how to
transparently link the Client Library with applications.
Previous research in the areas of process migration and
tools for debugging parallel and distributed programs
provides us with easy solutions, as mentioned earlier.
Therefore, the only issue that remains is how best to
transparently determine the end-to-end availability of
resources. First, simply determining that the Client
Library could perform this function better than
providing the functionality external to the applications
themselves is an important contribution. However,
determining the average end-to-end availability of a
network resource is not a trivial problem. The MSHN
team’s initial progress in this area has already been
detailed elsewhere [30] [31] [44].

4.  Summary  and  future  work 

In this paper we summarized the purpose of a resource
management system (RMS) in general and the research
goals of one particular experimental RMS, the
Management System for Heterogeneous Networks
(MSHN). Motivation was provided for all of the major
components of MSHN, and the architecture that contains
those components was explained. Some of the research
questions that the MSHN researchers are seeking answers
to were described. References were provided that enable
the reader to better understand MSHN, and to learn more
about the MSHN experiments. There are many other
interesting RMS research projects in progress today, but
space permitted us to survey only a few of them. In
addition to continuing the on-going experiments described
in the paper, future MSHN investigation will focus on (i)
reaching a better understanding of the level of granularity
obtainable from applications and the level required to
perform sufficiently good resource assignment; (ii) more
detailed characterization of security costing and metrics;
and (iii) determining the best search algorithms to use for
the MSHN optimization criteria under various conditions.